University of Alberta
A GENERAL FRAMEWORK FOR REDUCING VARIANCE IN AGENT EVALUATION

Author

  • Martha White
Abstract

In this work, we present a unified, general approach to variance reduction in agent evaluation that uses machine learning to minimize variance. Evaluating an agent's performance in a stochastic setting is necessary for agent development, scientific evaluation, and competitions. Traditionally, evaluation is done using Monte Carlo estimation (sample averages); however, the magnitude of the stochasticity in the domain or the high cost of sampling can often prevent this approach from yielding statistically significant conclusions. Recently, an advantage sum technique based on control variates has been proposed for constructing unbiased, low-variance estimates of agent performance. The technique requires an expert to define a value function over states of the system, essentially a guess of each state's unknown value. In this work, we propose learning this value function from past interactions between agents in some target population. Our learned value functions have two key advantages: they can be applied in domains where no expert value function is available, and they can result in evaluation tuned to a specific population of agents (e.g., novice versus advanced agents). This work has three main contributions. First, we consolidate previous work on using control variates for variance reduction into one unified, general framework and summarize the connections between the earlier approaches. Second, our framework makes variance reduction practically possible in any sequential decision-making task where designing the expert value function is time-consuming, difficult or essentially impossible. We prove the optimality of our approach and extend the theoretical understanding of advantage sum estimators. In addition, we significantly extend the applicability of advantage sum estimators and discuss practical methods for using our framework in real-world scenarios. Finally, we provide low-variance estimators for three poker domains previously without variance reduction and improve strategy selection in the expert-level University of Alberta poker bot. This work is an elaboration of published work [White and Bowling, 2009].
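The control-variate idea that the abstract builds on can be illustrated with a small simulation. The sketch below is not the advantage sum estimator from the thesis; it is a generic toy example under assumed conditions: a hypothetical game in which each payoff is the agent's contribution plus an observed chance effect with known mean zero (for example, the quality of the cards dealt) plus unobserved noise. The function name `simulate` and all distributions and parameters are assumptions made for illustration.

```python
import random
import statistics


def simulate(n_hands, seed=0):
    """Return (plain, adjusted) per-hand payoff samples for a toy game."""
    rng = random.Random(seed)
    plain, adjusted = [], []
    for _ in range(n_hands):
        skill = rng.gauss(1.0, 1.0)     # agent's contribution; true mean is 1.0
        dealt = rng.gauss(0.0, 10.0)    # observed chance effect; known mean is 0.0
        residual = rng.gauss(0.0, 2.0)  # chance that is not observed
        payoff = skill + dealt + residual
        plain.append(payoff)
        # Control variate: subtract the observed chance effect minus its
        # known mean, which leaves the expected payoff unchanged.
        adjusted.append(payoff - (dealt - 0.0))
    return plain, adjusted


if __name__ == "__main__":
    plain, adjusted = simulate(10_000)
    for name, xs in (("plain", plain), ("adjusted", adjusted)):
        print(name,
              "mean:", round(statistics.mean(xs), 3),
              "variance:", round(statistics.variance(xs), 3))
```

Both sample averages estimate the same expected payoff, but the adjusted samples have much lower variance because the largest source of luck has been subtracted out. In this toy setting the correction is given; the thesis instead learns a value function to play that role.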


Similar resources

Portfolio Performance Evaluation in a Modified Mean-Variance-Skewness Framework with Negative Data

The present study is an attempt toward evaluating the performance of portfolios using a mean-variance-skewness model with negative data. Mean-variance and mean-variance-skewness non-linear frameworks have been proposed based on Data Envelopment Analysis (DEA), in which the variance of the assets is used as an input to the DEA and expected return and skewness are the outputs. C...


A Practical Self-Assessment Framework for Evaluation of Maintenance Management System based on RAMS Model and Maintenance Standards

A set of technical, administrative and management activities is carried out over the life cycle of equipment to keep it in good condition and functioning as expected. This is referred to as a maintenance management system (MMS). Frameworks and models of assessment intended to enhance the effectiveness of an MMS can be grouped into two categories: qualitative and quantitative. In this resea...


Evaluation of de-desertification alternatives in Ardakan-Khezr Abad plain using the Shannon entropy method and the ORESTE model

Desertification is now one of the greatest environmental challenges. It is a global issue, and its serious consequences affect biodiversity, environmental safety, poverty eradication, economic and social stability and sustainable development around the world. Despite the serious environmental, social and economic impact of desertification, few studies have been done...


A Unique Mathematical Framework for Optimizing Patient Satisfaction in Emergency Departments

In healthcare systems, emergency departments (EDs) are the most vital elements, in that they provide critical and immediate healthcare services to the patients 24 hours a day. Patient satisfaction is a crucial concept and a practical tool for evaluating the performance of the EDs. This study presents a unique framework for the performance optimization of an emergency department in a big general...


Comparison of Self, Peer, and Clinical Teacher Evaluation in Clinical Skills Evaluation Process of Midwifery Students

Introduction: A precise judgment on performance requires a variety of data sources. In this regard, experts emphasize employing different assessment groups to determine whether students have achieved their learning objectives. This study was performed to compare self, peer, and teacher evaluation in the process of midwifery students' clinical skills evaluation at delivery roo...



Publication date: 2010